The iNaturalist Challenge 2017 Dataset
Authors
Abstract
Existing image classification datasets used in computer vision tend to have a roughly equal number of images for each object category. In contrast, the natural world is heavily imbalanced, as some species are more abundant and easier to photograph than others. To encourage further progress in challenging real-world conditions, we present the iNaturalist Challenge 2017 dataset, an image classification benchmark consisting of 675,000 images with over 5,000 different species of plants and animals. It features many visually similar species, captured in a wide variety of situations, from all over the world. Images were collected with different camera types, have varying image quality, have been verified by multiple citizen scientists, and feature a large class imbalance. We discuss the collection of the dataset and present baseline results for state-of-the-art computer vision classification models. Results show that current non-ensemble-based methods achieve only 64% top-1 classification accuracy, illustrating the difficulty of the dataset. Finally, we report results from a competition that was held with the data.
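As an illustration of the two quantities the abstract refers to, the following minimal Python sketch computes top-1 accuracy and a simple class imbalance ratio. The function names, the toy labels, and the choice of max-over-min class count as the imbalance measure are assumptions made for this example; they are not taken from the paper or from the challenge's evaluation code.

```python
from collections import Counter


def top1_accuracy(predicted, actual):
    """Fraction of samples whose single predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one sample is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def imbalance_ratio(labels):
    """Image count of the most frequent class divided by that of the rarest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Hypothetical toy labels, not drawn from the dataset.
true_labels = ["oak", "oak", "oak", "oak", "wren", "newt"]
predicted_labels = ["oak", "oak", "oak", "wren", "wren", "oak"]

print(top1_accuracy(predicted_labels, true_labels))  # 4 of 6 predictions are correct
print(imbalance_ratio(true_labels))                  # 4 "oak" images vs. 1 "newt" image
```

In this toy example the rare class "newt" is never predicted correctly, yet accuracy averaged over images is still about 0.67, which hints at why a heavily imbalanced benchmark is harder than the headline number alone suggests.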
Similar resources
The iNaturalist Species Classification and Detection Dataset
Existing image classification datasets used in computer vision tend to have a uniform distribution of images across object categories. In contrast, the natural world is heavily imbalanced, as some species are more abundant and easier to photograph than others. To encourage further progress in challenging real world conditions we present the iNaturalist species classification and detection datas...
The iNaturalist Species Classification and Detection Dataset - Supplementary Material
We performed an experiment to understand if there was any relationship between real world animal size and prediction accuracy. Using existing records for bird [4] and mammal [2] body sizes we assigned a mass to each of the classes in iNat2017 that overlapped with these datasets. For a given species, mass will vary due to the life stage or gender of the particular individual. Here, we simply tak...
Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning
Transferring the knowledge learned from large scale datasets (e.g., ImageNet) via fine-tuning offers an effective solution for domain-specific fine-grained visual categorization (FGVC) tasks (e.g., recognizing bird species or car make & model). In such scenarios, data annotation often calls for specialized domain knowledge and thus is difficult to scale. In this work, we first tackle a problem ...
Automatic segmentation of glioma tumors from BraTS 2018 challenge dataset using a 2D U-Net network
Background: Glioma is the most common primary brain tumor, and early detection of tumors is important in treatment planning for the patient. The precise segmentation of the tumor and intratumoral areas on MRI by a radiologist is the first step in the diagnosis; in addition to being time-consuming, it can also yield different diagnoses from different physicians. The aim of this study...
The HASYv2 dataset
This paper describes the HASY dataset of handwritten symbols. HASY is a publicly available, free of charge dataset of single symbols similar to MNIST. It contains 168 233 instances of 369 classes. HASY contains two challenges: A classification challenge with 10 pre-defined folds for 10-fold cross-validation and a verification challenge.
Journal title:
- CoRR
Volume abs/1707.06642
Publication date 2017